
Researchers at the University of Toronto, Canada, have developed an artificial intelligence system that can use generative diffusion to create proteins that do not exist in nature. The system promises to make the design and testing of therapeutic proteins more efficient and flexible, thereby accelerating human drug development.

Proteins are composed of chains of amino acids that fold into three-dimensional shapes that, in turn, determine the function of the protein. These folded three-dimensional shapes have evolved over billions of years and are diverse and complex, but their number is limited. Researchers have therefore begun to try to design folding patterns that do not arise naturally.

The main challenge in this research is "imagining" new folds, because it is difficult to predict which folds are physically realistic and would actually work in a protein structure. The scientists addressed this problem by combining a biophysics-based representation of protein structures with diffusion methods borrowed from image generation, creating a new system called ProteinSGM.

The model learns from the image representation and generates entirely new proteins at a very high rate. In addition to the challenges of optimizing the image generation process, validating the proteins generated by the system was difficult because many of the structures produced by the system are different from any found in nature, the researchers said.
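To make the idea of an "image representation" more concrete, the sketch below turns a toy protein backbone into a pairwise distance map (a square grid of numbers that can be treated like a grayscale image) and then applies one Gaussian noising step of the kind diffusion models are trained to reverse. This is only an illustration under stated assumptions: the coordinates, the function names `distance_map` and `add_noise`, and the choice of a plain distance map are hypothetical, and the article does not specify the exact representation or noise schedule ProteinSGM uses.

```python
import math
import random


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def add_noise(image, noise_scale, rng):
    """Perturb each off-diagonal entry with Gaussian noise, keeping symmetry."""
    n = len(image)
    noisy = [row[:] for row in image]
    for i in range(n):
        for j in range(i + 1, n):
            value = image[i][j] + rng.gauss(0.0, noise_scale)
            noisy[i][j] = noisy[j][i] = value
    return noisy


# Hypothetical coordinates (in angstroms) for four residues; not real data.
toy_backbone = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
]

clean = distance_map(toy_backbone)
noisy = add_noise(clean, noise_scale=0.5, rng=random.Random(0))

for row in clean:
    print(" ".join(f"{d:5.2f}" for d in row))
```

A generative model of this kind would be trained to undo such noise step by step, so that starting from pure noise it arrives at a plausible new map. That training step is not shown here.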

According to the team's evaluation metrics, almost all of the resulting structures looked plausible, but the researchers needed further evidence. They turned to the structure-prediction AI OmegaFold (described as an improved version of DeepMind's AlphaFold 2), which tested and confirmed that almost all of the new sequences folded into the desired new protein structures. After complementing these predictions with physical tests in the lab, the researchers were finally convinced that these were genuine protein folds.